A contrast sensitivity model of the human visual system in modern conditions for presenting video content

Digital video incurs many distortions during processing, compression, storage, and transmission, which can reduce perceived video quality. Developing adaptive video transmission methods that provide increased bandwidth and reduced storage space while preserving visual quality requires quality metrics that accurately describe how people perceive distortion. A severe problem for developing new video quality metrics is the limited data on how the early human visual system simultaneously processes spatial and temporal information. The problem is exacerbated by the fact that the few data collected in the middle of the last century do not consider current display equipment and are subject to medical intervention during collection, which does not guarantee a proper description of the conditions under which media content is currently consumed. In this paper, 27,840 visibility thresholds of spatio-temporal sinusoidal variations, which are necessary to determine the artefacts that a human perceives, were measured by a new method using different spatial sizes and temporal modulation rates. A multidimensional model of human contrast sensitivity in modern conditions of video content presentation is proposed based on the new large-scale data obtained during the experiment. We demonstrate that the presented visibility model has a distinct advantage in predicting subjective video quality by testing video quality metrics that incorporate our and other visibility models against three publicly available video datasets.

We present the manuscript that meets PLOS ONE's style requirements, including those for file naming.
2. We note that your Data Availability Statement is currently as follows: [All relevant data are within the manuscript and its Supporting Information files.] Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the "minimal data set" for their submission. PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition). For example, authors should submit the following data: - The values behind the means, standard deviations and other measures reported; - The values used to build graphs; - The points extracted from images for analysis. Authors do not need to submit their entire data set if only a portion of the data was used in the reported study. If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories. If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access.
We confirm that our submission contains all raw data required to replicate the results of our study. The data are available at https://github.com/extezo/plos-one.
We submit the following data: -The values behind the means and standard deviations, -The values used to build graphs, -The points extracted from images for analysis.
Ethical restrictions on sharing a de-identified data set exist and are imposed by the HECS Human Ethics Committee, University of Waikato (contact point for enquiries: Lois Vuursteen, lois.vuursteen@waikato.ac.nz).
The explanation of the ethical restrictions on sharing a de-identified data set is presented in the open repository: Data collected from the survey must be stored on university computers. The identities of participants are kept confidential and do not relate to their results. Results can be used only for research dissemination at conferences, journals, research thesis, and teaching resources. For the purposes presented above, data can be obtained by request at the email address anast.mozhaeva@gmail.com.
3. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.
We changed the name of the part from Procedure to Method, since previously there was no Method part. The ethics statement now appears in the Methods section.

Modified text, line 217:
HECS-20-64, HECS-20-58 where 21 December 2020 was the start, and 29 January 2021 was the end of the recruitment period for this study. Written informed consent was obtained from all participants.
4. We note that Figures 4 and 10 in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.
We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission: a. You may seek permission from the original copyright holder of Figure 4
We have replaced the images in both cases.
5. Please include a copy of Table 2 which you refer to in your text on page 10.
Our apologies, this was a typographical error.

Modified text, line 308:
We report the coefficients of the defined approximation polynomials on our open repository.
6. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewer #1:
The authors created a setup for evaluating the detection threshold of spatio-temporal artefacts in movies and developed a mathematical model to predict those thresholds. Furthermore, they conducted a comprehensive study to collect data, which in turn was used to test the model. Finally, the results were compared to existing models. This is really an impressive piece of work and I would like to congratulate the authors on it!
We thank the reviewer for the positive appraisal of our work.
Accordingly, I only have some minor comments:
We use a linear grid of average screen brightness values, which is suitable for further analysis with existing visibility models [22]. The minimum and maximum average brightness values used ensure that the test pattern is not distorted by the hard limits of the eight-bit brightness range.
Modified text in Discussion, beginning on line 413: During the experiment, we considered increasing the number of brightness points, but the degree of nonlinearity was not large enough to require more. We report the coefficients of the defined approximation polynomials on our open repository.

Reviewer #2:
The summary of the research and overall impression is good, however, there is a lot of scope for improvement in the article for a better understanding of the reader. The strength of the article is the success in creating the model as expected, somewhere the methodology has been made complex by connecting it to the introduction part. This part has to be reworked majorly in the current article.
The authors have made a good start in the abstract and introduction part of the article, however, in comparing the research question to the previously published article, the results of the current study have also been discussed.
Thank you for your comment.
We reworked the methodology and introduction part in response to the suggestions made by both reviewers.
Lines 75-80 require a reframing of the sentence for a better understanding of the reader.

Modified text, line 75:
The CSF is a bandpass filter that passes only visual stimuli that the observer can perceive. Only video artefacts in the passband region can be perceived by humans, hence the importance of the CSF. Extant datasets on the CSF have been collected with limitations. Datasets were collected firstly as a temporal measurement threshold [13]; secondly as the threshold for spatial contrast sensitivity only [14,15]; and simultaneous spatial and temporal measurements using targets that were sinusoidal in both space and time [16 Today, CSF research still does not present data sets with dependent spatial and temporal frequency measurements.
Line 117, the same word has been repeated?
Modified text, line 116: Each participant requires about 30 minutes to perform 960 evaluations of various interactions of luminance, spatial, and temporal frequency values.
While the study appears to be good, I advise the authors to rework the methodology part of the study, giving better clarity for the current work being published. Clarification is needed on what basis the sample size for both the 1st and 2nd experiments was decided.

Modified text, line 253:
All experiments continued until the confidence interval fell below 5% of the current mean value for each point across participants. Consequently, the sample size for the 1st and 2nd experiments was determined according to the Rec. ITU-R BT.500-15 standard [32].
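The stopping rule quoted above can be sketched as follows. This is only an illustration under stated assumptions: the letter does not give the confidence level or the estimator, so the sketch assumes a 95% level with a normal approximation (z = 1.96) and reads "confidence interval" as the half-width around the mean. All function names are hypothetical.

```python
import math
import statistics

Z_95 = 1.96  # assumption: 95% confidence level, normal approximation


def ci_half_width(values):
    """Half-width of the confidence interval of the mean of `values`."""
    if len(values) < 2:
        return math.inf  # no spread estimate is possible from one observation
    return Z_95 * statistics.stdev(values) / math.sqrt(len(values))


def enough_participants(values, rel_tol=0.05):
    """True once the CI half-width is below rel_tol (5%) of the current mean."""
    mean = statistics.fmean(values)
    return ci_half_width(values) < rel_tol * abs(mean)


def participants_needed(stream, rel_tol=0.05):
    """Add one participant's threshold at a time for a single test point.

    Returns the number of participants at which the rule is first met,
    or None if the stream ends before that happens.
    """
    seen = []
    for value in stream:
        seen.append(value)
        if enough_participants(seen, rel_tol):
            return len(seen)
    return None
```

In an experiment with many test points, such a check would be applied to each point separately, and recruitment would stop only when every point satisfies it.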
The methodology followed for the comparison of the model needs an explanation of the procedure/methods.
We provided an explanation of the procedure followed for the comparison of the models.

Modified text, line 331:
All three models generate time coefficients that change with the frame, which are then passed to the VQM to calculate the score value. We averaged quality predictions across all frames in each video sequence in the data sets. We fit a non-linear model to each metric that maps from its predictions to the average human scores in the open datasets. We report the prediction results as the Pearson linear correlation coefficient (PLCC). We also report the average of PLCCs computed per dataset, including upper and lower bounds.
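The evaluation steps described above (per-frame scores averaged per sequence, a non-linear mapping to the human scores, then PLCC) can be sketched as below. The letter does not state the form of the non-linear model, so a cubic polynomial is used here purely as a stand-in. The function names are hypothetical and the sketch assumes NumPy is available.

```python
import numpy as np


def sequence_score(frame_scores):
    """Average the per-frame quality predictions of one video sequence."""
    return float(np.mean(frame_scores))


def fit_nonlinear_mapping(predictions, human_scores, degree=3):
    """Fit a mapping from metric predictions to mean human scores.

    Assumption: a cubic polynomial stands in for the unspecified
    non-linear model mentioned in the text.
    """
    return np.poly1d(np.polyfit(predictions, human_scores, degree))


def plcc(x, y):
    """Pearson linear correlation coefficient between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


def evaluate_metric(per_frame_scores, human_scores):
    """PLCC between mapped sequence-level predictions and human scores."""
    predictions = np.array([sequence_score(f) for f in per_frame_scores])
    human_scores = np.asarray(human_scores, dtype=float)
    mapping = fit_nonlinear_mapping(predictions, human_scores)
    return plcc(mapping(predictions), human_scores)
```

Running `evaluate_metric` once per dataset and then averaging the resulting values would give a per-dataset summary of the kind the text reports.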
Values excluded from the study are missing in the result part, although they have been discussed in the discussion part with justification.

Modified text, line 288:
It should also be noted that participants noticed flickering at the higher frequencies tested: at 15 and 40 Hz (pixel brightness level is 200); at 36 Hz (pixel brightness level is 160); at 13, 20 and 48 Hz (pixel brightness level is 80); and at 40 and 46 Hz (pixel brightness level is 40). These values were reasonably excluded from the analysis; details are discussed below.

Modified text, line 428:
Based on the exact repeatability of the frequencies of these anomalies, we assumed that the recorded effect is associated more with the test equipment's imperfections than with the visual system's properties. We did additional measurements using a photosensor and an oscilloscope, but no spurious frequencies were detected in the generated stimulus. Also, it must be considered that at these frequencies, the contrast sensitivity of vision becomes commensurate with the value of the quantization noise of the display device and the PWM backlight controller, as well as with the sampling parameters of the measuring equipment. We excluded these changes when processing the results and provided the experimental data for further research.
Points in lines 295-297 can be a part of the discussion.
Thank you for the suggestion, but in our opinion, it would be more understandable for readers if we presented this information in the results section.
We have written PLCC as Pearson Linear Correlation Coefficient in the table.
Discussion can be elaborated, by moving some of the points from the previous parts of the article. Authors should have a comparison of the current model with the previous ones in the discussion part as this is not one of the main objectives as per the title.
Following the first reviewer's comments, we elaborated on the Discussion part. The last two paragraphs of Related Work were moved to the Discussion. Also, following your suggestion, the comparison of the current model with the previous models was moved to the Discussion part, including Fig. 9, Fig. 10 and the associated exposition in the text.
Limitations of the study can be discussed.
We added information on the Limitations of the study in the Discussion part.

Modified text, line 449:
Limitations of the experiment include imperfections in the test apparatus and the measurement process.Due to the limitation of contrast and maximum screen pixel brightness, the characteristics of vision in the state of its full adaptation were studied.Changes in the mode of a non-adapted HVS, which can be used, for example, when observing high-contrast scenes in HDR systems, require an appropriate stimulus generation device.
Full data sheet access has not been attached to the article.
Our submission contains all raw data required to replicate the results of our study, https://github.com/extezo/plos-one: -The values behind the means and standard deviations, -The values used to build graphs, -The points extracted from images for analysis, -The coefficients of the defined approximation polynomials of s(k, f, 120) derivatives.
The reference style has to be reworked as there is a discrepancy in the list of references.
The changes to the references are presented above in response to the journal requirements.
line 73: reference Cecchi [2018] is missing
Modified text, line 72: (More recent definitions of early vision, for example, that given by Cecchi [11], include computation of basic properties like shape and color.)
line 118-155: These paragraphs may better fit into the discussion section.
The paragraphs were moved to the discussion section.
line 185: Why were especially these brightness values chosen?
Modified text, line 148:

Figure 1 to 5 requires citation and repositioning in the text of the article.

Figure 1 repositioned to Model comparison part as Figure 9, second paragraph.

Figure 2 is repositioned as Figure 1.
Modified text, label of what is now Figure 1: The stimulus used to generate Mira. The spectrum of the test signal. Normalized frequency is the digital frequency of the spectrum. The horizontal dashed line represents the spatial sensitivity threshold [26].

Figure 3 is repositioned as Figure 2.
Modified text, label of what is now Figure 2: Structural diagram of the display system installation for research [26].

Figure 4 is repositioned as Figure 3.

Figure 5 is repositioned as Figure 4.
Modified text, line 184: Using the presented equipment, we measured the relationship between the value of the specified pixel brightness and the screen's actual brightness used in the experiment. Fig 4 shows how the pixel brightness values used in the proposed work correspond to the generally accepted brightness in candelas per square meter.

Figure 6 repositioned to Results part as Figure 5, first paragraph.
Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form." In the figure caption of the copyrighted figure, please include the following text: "Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year]." b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder's requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

Table 1 ,
Line 363, is that Table 1? In the content, it has been written as table 2, but there is no table 2 found in the present article.
We report the coefficients of the defined approximation polynomials on our open repository.
representation of data in comparison to other models, PLCC needs to be written as Pearson Linear Correlation Coefficient.